Dylan Mumm

CPSC 3300

Homework 5

Due 11:59 PM Monday, April 15

1. Consider an array declared in C as "double a[100];". How many 64-byte cache lines are required to hold the complete array? [8pts]
   * Cache lines = array size / line size
   * Cache lines = sizeof(a) / 64 bytes
   * Cache lines = (100 \* 8 bytes) / 64 bytes = 800 / 64
   * Cache lines = 12.5, rounded up to a whole line
   * 13 cache lines of 64 bytes each (assuming the array starts on a 64-byte boundary)
2. Consider the byte address 0x002468ac. What is the value modulo 64? (That is, what is the offset of this address within a 64-byte block?) [8pts]
   * 0x002468ac = 0000 0000 0010 0100 0110 1000 1010 1100 in binary
   * Modulo 64 keeps only the low 6 bits: 10 1100
   * 0x0000002c (44 in decimal)
3. Consider the byte address 0x002468ac. What is the value shifted to the right by 6 bits? (That is, what is the block address corresponding to this byte address when using 64-byte blocks?) [8pts]
   * 0x002468ac = 0000 0000 0010 0100 0110 1000 1010 1100 in binary
   * Shifted right by 6 bits: 0000 0000 0000 0000 1001 0001 1010 0010
   * 0x000091a2
4. Consider matrix transpose written in C. Which array is exhibiting spatial locality: array "a", "b", or both? (Note that NROWS and NCOLS could each be relatively large compared to the size of the cache.) [8pts]

       for (i = 0; i < NROWS; i++) {
           for (j = 0; j < NCOLS; j++) {
               b[i][j] = a[j][i];
           }
       }

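   * Array “b”, not “a”. The inner loop varies j, so b[i][j] walks consecutive elements of one row (C arrays are stored row-major), while a[j][i] jumps a full row of “a” on every iteration.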

5. Consider a 4 GB byte-addressable main memory (32-bit address) with a level-1 data cache that is eight-way set-associative, 32 KB in size, with 64-byte block size. [24pts (8pts each)]

a) How many total blocks are there in cache?

   * 32 KB / 64 bytes = 512 blocks

b) How many sets are there?

   * 512 blocks / 8 ways = 64 sets

c) Show how the main memory address is partitioned into fields for the cache access and give the bit lengths of these fields.

   * Tag = 20 bits (32 - 6 - 6)
   * Index = 6 bits (64 sets)
   * Offset = 6 bits (64-byte blocks)
6. Consider a direct-mapped data cache design in which a 32-bit address is divided into these three fields: 18-bit tag, 10-bit index, and 4-bit offset. [24pts (6pts each)]

a) How large is a line in number of bytes?

   * 2^4 = 16 bytes

b) How many lines are in the cache?

   * 2^10 = 1024 lines

c) How large is the cache in number of bytes?

   * 1024 lines \* 16 bytes = 16384 bytes (16 KB)

d) For the following segment of code written in C, where "sum" and the array "a" are typed as 4-byte integers, what is the miss rate?

(Assume the variable "sum" and the loop index "i" are register-allocated by the compiler within the body of the loop and thus do not cause data cache accesses within the loop.)

       for (i = 0; i < 4096; i++)
           sum = sum + a[i];

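   * The miss rate is 25%, assuming the cache is initially empty and "a" starts on a 16-byte line boundary. Each 16-byte line holds four 4-byte integers, so the first access to a line misses and the next three hit: 1024 misses in 4096 accesses.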

7. Assume a 256-byte main memory and a four-line cache with four bytes per line. The cache is initially empty. For the byte-address reference stream (reads) given below, circle the references that are hits under each cache placement scheme. Also show the final contents of the cache. (The byte addresses are in decimal.) [20pts (10pts each)]

a) direct-mapped

0, 16, 8, 1, 10, 30, 18, 29, 2, 25

|  | 0 | 16 | 8 | 1 | 10 | 30 | 18 | 29 | 2 | 25 | Final |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Line 0b00 |  |  |  |  |  |  |  |  |  |  |  |
| Line 0b01 |  |  |  |  |  |  |  |  |  |  |  |
| Line 0b10 |  |  |  |  |  |  |  |  |  |  |  |
| Line 0b11 |  |  |  |  |  |  |  |  |  |  |  |

b) fully-associative with first-in-first-out replacement

0, 16, 8, 1, 10, 30, 18, 29, 2, 25